
npj Precision Oncology

Springer Science and Business Media LLC

Preprints posted in the last 7 days, ranked by how well they match npj Precision Oncology's content profile, based on 48 papers previously published here. The average preprint has a 0.04% match score for this journal, so anything above that is already an above-average fit.

1
Consensus Through Diversity: A Comprehensive Benchmark of Multi-Omic Approaches for Precision Breast Oncology

Sionakidis, A.; Pinilla Alba, K.; Abraham, J.; Simidjievski, N.

2026-04-21 bioinformatics 10.64898/2026.04.17.719159 medRxiv
Top 0.1%
7.2%

Emerging multi-omic profiling has made it feasible to subtype disease using multiple molecular layers. However, inconsistent preprocessing, heterogeneous implementations, variable evaluation, and limited reproducibility often constrain method selection. Here, we systematically benchmark 22 publicly available unsupervised approaches for bulk data on the TCGA-BRCA cohort across five modalities (RNA-seq, miRNA, DNA methylation, copy numbers, single nucleotide polymorphisms) and validate findings in two independent datasets, enabling a multi-layered comparison of performance, heterogeneous data support and interpretability. Most approaches fuse multi-omic data to produce a two-cluster solution largely aligned with ER status, with higher-resolution approaches further refining these into four coherent subclasses (angiogenic luminal, oxidative-phosphorylation/HER2-low luminal, immune-inflamed basal-like, and hyper-proliferative basal-like). Our benchmarking results indicate that methods based on similarity networks can efficiently produce stable, reliable partitions. Matrix factorisation and Bayesian factorisation algorithms produce rich latent representations, allowing quantification of feature and modality contributions, albeit at higher computational cost. Consensus clustering can be used on a case-by-case basis and refine partitions into more robust and generalisable findings. We aggregate our insights into a decision workflow that aligns with study goals, data characteristics, and computational resources, enabling optimal analytic strategies. This comprehensive assessment provides a practical roadmap for investigators seeking to extract reproducible, biologically meaningful subtypes from complex multi-omic datasets. We highlight the different technical and practical benefits and trade-offs that shape the selection and development of multi-omic approaches applied in precision oncology.

2
Mechanistic learning to predict and understand minimal residual disease

Marzban, S.; Robertson-Tessi, M.; West, J.

2026-04-21 cancer biology 10.64898/2026.04.16.718968 medRxiv
Top 0.1%
6.8%

Mechanistic modeling has long been used as a tool to describe the dynamics of biological systems, especially cancer in response to treatment. Its key advantage lies in the interpretability of relationships between input parameters and outcomes of interest. In contrast, machine learning techniques offer strong prediction performance, especially for high dimensional datasets that are common in oncology. Here, we employ a Mechanistic Learning framework that combines the advantages of both approaches by training machine learning models on mechanistic parameters inferred from clinical patient data. The mechanistic model (a Markov chain model) contains sixteen parameters that describe the rate of cell fate transitions that occur in patients with B-cell precursor acute lymphoblastic leukemia. The machine learning model (a ridge logistic regression) is trained on these parameters to predict two clinically-relevant features: BCR::ABL1 fusion gene status (positive or negative) and minimal residual disease status (positive or negative) post-induction chemotherapy. Model training is done in an iterative fashion to assess which (and how many) parameters are critical to maintain high predictive performance. Using machine learning models trained on the clinical flow-cytometry data, we find that the stem-like cell state alone is the most predictive feature for both BCR::ABL1-positive and MRD-positive disease, with combination scores (defined as the average of accuracy, balanced accuracy, and area under the curve) of 0.80 and 0.67, respectively. By comparison, mechanistic learning achieves comparable or improved combination scores for BCR::ABL1-positive and MRD-positive disease, with scores of 0.81 and 0.71, respectively, using only de-differentiation for BCR::ABL1 and primitive-state persistence together with differentiation-directed exit for MRD. Thus, the mechanistic-learning approach not only preserves predictive performance, but also provides a biological hypothesis for why stemness is predictive of these clinically relevant outcomes.
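
The combination score in this abstract is defined as the average of accuracy, balanced accuracy, and area under the curve. A minimal sketch of that definition follows, assuming an unweighted mean and metrics on a 0 to 1 scale; the function name `combination_score` and the input values are illustrative and are not taken from the preprint.

```python
from statistics import mean


def combination_score(accuracy: float, balanced_accuracy: float, auc: float) -> float:
    """Unweighted mean of accuracy, balanced accuracy, and AUC.

    Assumption: all three metrics are on a 0-1 scale and equally weighted,
    which is how the abstract words the definition.
    """
    metrics = {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "auc": auc,
    }
    for name, value in metrics.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return mean(metrics.values())


# Illustrative inputs only; these are not results from the preprint.
print(round(combination_score(0.80, 0.70, 0.90), 2))
```

Because the three metrics are averaged with equal weight, a model with a high AUC but poor balanced accuracy on an imbalanced outcome will still be pulled down, which may be why a composite was preferred over any single metric.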

3
A Context-Aware Target Engagement and Pharmacodynamic Biomarker Resource to Accelerate Drug Discovery and Development

Yang, Y.; Zhao, L.; Orouji, S.; Zhu, Y.; Johnson, R. L.; Maxwell, D. S.; Mica, I.; Russell, K. P.; Al-lazikani, B.

2026-04-22 bioinformatics 10.64898/2026.04.19.719411 medRxiv
Top 0.1%
6.3%

Confirming target engagement in tumor experimental models remains a major challenge in oncology drug development. Pharmacodynamic biomarkers can help address this, but few systematic resources link drug targets to candidate biomarkers. We developed TargetTrace, a comprehensive resource to identify and prioritize pharmacodynamic biomarkers across nine key target classes: transcription factors/cofactors, kinases, phosphatases, ubiquitin ligases, deubiquitinases, acetyltransferases, deacetylases, methyltransferases, and demethylases. Biomarker candidates were gathered from curated molecular interaction resources and refined using external annotations to improve accuracy. For enzyme targets with measurable substrate changes, we applied a two-agent large language model workflow, followed by manual review, to harmonize antibody information from antibody resources and ensure that the selected biomarkers are measurable with existing laboratory tests. From more than 92,000 input interactions and over 2,300 targets, we compiled 71,323 target-biomarker relationships involving 2,270 potential drug targets, encompassing both transcription factor/cofactor-target gene and enzyme-substrate interactions. Commercial antibodies were available for over 1,400 biomarkers, supporting laboratory validation. TargetTrace provides a structured and reusable resource for systematic identification and prioritization of pharmacodynamic biomarkers in oncology.

4
Gut Microbiome as a Diagnostic Biomarker for Early Cancer Detection: A Systematic Review and Meta-Analysis of 18 Studies across Five Cancer Types

Tall, M. L.

2026-04-22 cancer biology 10.64898/2026.04.19.719461 medRxiv
Top 0.1%
4.9%

Background: The gut microbiome has emerged as a promising non-invasive biomarker for early cancer detection. However, evidence remains fragmented across individual studies with limited cross-cancer comparisons. Objectives: To systematically evaluate the diagnostic accuracy of gut microbiome-based signatures across five major cancer types: colorectal cancer (CRC), gastric cancer (GC), pancreatic ductal adenocarcinoma (PDAC), hepatocellular carcinoma (HCC), and lung cancer (LC). Methods: We conducted a systematic literature search in PubMed, Embase, and Web of Science (January 2000 - April 2026), following PRISMA 2020 guidelines. Studies reporting area under the receiver operating characteristic curve (AUC) for microbiome-based cancer classification were included. Pooled AUC estimates were derived using a DerSimonian-Laird random-effects model. Study quality was assessed using the Newcastle-Ottawa Scale (NOS). Results: Eighteen studies (2,587 participants) met inclusion criteria. Pooled AUC values were: CRC 0.785 (95% CI 0.750-0.819; I²=30.6%), GC 0.834 (0.781-0.887; I²=56.6%), PDAC 0.853 (0.785-0.921; I²=60.8%), HCC 0.809 (0.747-0.871; I²=70.3%), and LC 0.780 (0.738-0.822; I²=25.0%). Fusobacterium nucleatum was consistently enriched across CRC, GC, and PDAC, while Faecalibacterium prausnitzii and Akkermansia muciniphila were depleted in all five cancer types. Porphyromonas gingivalis showed the highest fold-change in PDAC (log2FC=+2.8). Risk of bias was moderate-to-high in all studies. Conclusions: Gut microbiome profiling demonstrates good-to-excellent diagnostic accuracy (AUC 0.78-0.85) across five major cancer types. Shared cross-cancer biomarkers suggest common dysbiotic mechanisms amenable to pan-cancer screening. These findings support integration of microbiome signatures into multi-modal cancer detection platforms.
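
The abstract above pools AUC estimates with a DerSimonian-Laird random-effects model and reports I² for heterogeneity. A minimal sketch of the standard DerSimonian-Laird estimator follows, assuming each study supplies a point estimate and its sampling variance. The function name, the return layout, and the example inputs are hypothetical; the per-study values used in the preprint are not given in the abstract.

```python
import math


def dersimonian_laird(estimates, variances):
    """Pool study estimates with the DerSimonian-Laird random-effects method.

    estimates: per-study effect estimates (for example, AUC values).
    variances: per-study sampling variances (all must be positive).
    Returns the pooled estimate, its standard error, tau^2, Q, and I^2 (%).
    """
    k = len(estimates)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Fixed-effect (inverse-variance) weights and pooled mean.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q and the method-of-moments estimate of tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to every study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)

    # I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "tau2": tau2, "q": q, "i2": i2}


# Hypothetical inputs for illustration; not data from the review.
result = dersimonian_laird([0.70, 0.90], [0.001, 0.001])
low = result["pooled"] - 1.96 * result["se"]
high = result["pooled"] + 1.96 * result["se"]
print(f"pooled={result['pooled']:.3f}, 95% CI {low:.3f}-{high:.3f}, I2={result['i2']:.1f}%")
```

When a study reports only a 95% confidence interval, its variance can be approximated as ((upper - lower) / (2 × 1.96))², assuming a symmetric normal interval.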

5
Attention-Guided CNN Ensemble for Binary Classification of High-Grade and Low-Grade Serous Ovarian Carcinoma from Histopathological WSI Patches

Rani, A.; Mishra, S.

2026-04-22 oncology 10.64898/2026.04.21.26351441 medRxiv
Top 0.2%
3.7%

Accurate histopathological differentiation between High-Grade Serous Carcinoma (HGSC) and Low-Grade Serous Carcinoma (LGSC) remains a critical yet challenging aspect of ovarian cancer diagnosis due to their similar morphology and different clinical outcomes. This study presents a deep learning framework that uses custom attention mechanisms, including the Convolutional Block Attention Module (CBAM), Squeeze-and-Excitation (SE) blocks, and a Differential Attention module within five CNN architectures for automated binary classification of ovarian cancer subtypes from H&E WSI patches. Although individual models achieved higher accuracy, the ensemble stacking framework with a shallow MLP meta-learner delivered the best overall performance, with a ROC-AUC of 0.9211, an accuracy of 0.85, and F1-scores of 0.84 and 0.85 across both subtypes. These findings demonstrate that attention-guided feature recalibration combined with ensemble stacking provides robust and clinically interpretable discrimination of ovarian carcinoma subtypes.

6
RNABag: A Generalizable Transcriptome Foundation Model for Precision Oncology across Biopsy Modalities

Luo, P.; Luo, D.; Li, D.; Xue, X.; Yang, J.; Gong, X.; Tang, K.

2026-04-22 bioinformatics 10.64898/2026.04.19.719450 medRxiv
Top 0.2%
3.6%

Transcriptomic data is highly sensitive to cancer state and progression, making transcriptome-based foundation models highly promising for diverse clinical ontological inference. However, transcriptome analyses are conventionally hindered by technical batch effects and limited generalization across platforms. Here, we introduce RNABag, a foundation model designed to generalize well to external datasets. In particular, the model focuses only on highly variable genes to reduce noise, and extensive data augmentation was used during pretraining so that RNABag learns robust representations that are invariant to batch variations. We demonstrate that RNABag achieves superior performance in pan-cancer tissue-of-origin classification and cancer detection in internal validation sets, as well as in zero-shot generalization to external cohorts and in-house clinical samples. Furthermore, RNABag, after specialized finetuning, exhibits strong capabilities in a wide range of clinical applications. The model effectively stratifies patient survival and predicts relapse risks, highlighting key molecular pathways driving tumor progression. Crucially, we extend RNABag's utility to liquid biopsies, achieving high diagnostic accuracy in plasma cfRNA and tumor-educated platelets (TEPs), thereby supporting its application in non-invasive cancer monitoring. Interpretability analysis revealed a pivotal role of tumor immune escape in the cancer-induced plasma cfRNA signals. In summary, our study indicates that cancer states and progression may be monitored in detail and with precision via comprehensive modeling of the transcriptome across biopsy modalities.

7
A prognostic signature based on ectopic reactivation of eight tissue-specific genes in Diffuse Large B Cell Lymphoma.

Montaut, E.; Rainville, V.; Betton-Fraisse, P.; Merre, W.; Khedimallah, S.; Govin, J.; Rousseaux, S.; Khochbin, S.; Jardin, F.; Ruminy, P.; Bourova-Flin, E.; Emadali, A.; Carras, S.

2026-04-27 hematology 10.64898/2026.04.23.26351580 medRxiv
Top 0.2%
3.1%

Diffuse Large B-cell lymphoma (DLBCL) is the most common aggressive lymphoma in the Western world. First-line immunochemotherapy fails in approximately 30-40% of patients, with refractory and relapsed patients presenting a dismal prognosis. Currently, these high-risk patients cannot be accurately identified at diagnosis. Using statistical modeling and machine learning approaches applied to large public DLBCL datasets, we identified a novel predictive signature based on the reactivation of eight normally silent tissue-dependent genes associated with survival. We then developed a multiplex RT-MLPseq-based assay, compatible with formalin-fixed paraffin-embedded (FFPE) samples and transferable into routine clinical practice, that enables expression analysis of these eight genes, and validated their prognostic impact in an independent real-life cohort. This signature could be integrated with current prognostic indices and molecular classifications to improve patient stratification and guide treatment selection toward a personalized theragnostic approach, thereby enhancing management of non-responder patients.

8
Transcriptomic subtypes in high-grade serous ovarian cancer are driven by tumor cellular composition

Tanis, S.; Lixandrao, M.; Ivich, A.; Grieshober, L.; Lawson-Michod, K. A.; Collin, L. J.; Peres, L. C.; Salas, L. A.; Marks, J. R.; Bitler, B. G.; Greene, C. S.; Schildkraut, J. M.; Doherty, J. A.; Davidson, N. R.

2026-04-21 cancer biology 10.64898/2026.04.16.719000 medRxiv
Top 0.2%
3.1%

High-grade serous ovarian carcinoma (HGSC) is an aggressive malignancy for which bulk transcriptomic subtypes are used to stratify tumors, interpret biology, and guide biomarker development. The four TCGA-derived subtypes, mesenchymal (C1.MES), immunoreactive (C2.IMM), proliferative (C5.PRO), and differentiated (C4.DIF), are consistently observed across cohorts. However, despite their prominence, these subtypes have not translated into therapeutic utility, and their biological basis remains unresolved. Here, we show that HGSC transcriptomic subtypes are largely determined by tumor cellular composition rather than intrinsic malignant transcriptional programs. By integrating controlled single-cell-derived pseudobulk simulations with deconvolution-based analysis of 1,834 primary HGSC tumors across RNA-seq and microarray cohorts, we demonstrate that subtype probabilities align along a composition-driven axis of stromal and immune variation. Cellular composition alone predicted subtype labels with high accuracy (ROC-AUC = 0.81-0.95) and explained a substantial fraction of subtype-associated transcriptomic variation, with the mesenchymal (C1.MES) subtype representing the most robust and reproducible example of composition-driven signal. Although a secondary, composition-independent expression signal is detectable, it does not define the dominant structure of subtype classification. These findings redefine HGSC transcriptomic subtypes as features of the tumor ecosystem rather than discrete malignant states. This reinterpretation has immediate implications for studies that use subtype labels to infer tumor-intrinsic biology and provides a generalizable framework for separating composition-driven and intrinsic signals in bulk tumor data. Significance Statement: HGSC transcriptomic subtypes lack consistent clinical utility and remain biologically ambiguous. We show subtype assignments are largely driven by tumor cellular composition, and less so by distinct intrinsic tumor states.

9
CT-Based Deep Foundation Model for Predicting Immune Checkpoint Inhibitor-Induced Pneumonitis Risk in Lung Cancer

Muneer, A.; Showkatian, E.; Kitsel, Y.; Saad, M. B.; Sujit, S. J.; Soto, F.; Shroff, G. S.; Faiz, S. A.; Ghanbar, M. I.; Ismail, S. M.; Vokes, N. I.; Cascone, T.; Le, X.; Zhang, J.; Byers, L. A.; Jaffray, D.; Chang, J. Y.; Liao, Z.; Naing, A.; Gibbons, D. L.; Vaporciyan, A. A.; Heymach, J. V.; Suresh, K. S.; Altan, M.; Sheshadri, A.; Wu, J.

2026-04-23 oncology 10.64898/2026.04.21.26351428 medRxiv
Top 0.4%
2.1%

Background: Immune checkpoint inhibitors (ICIs) have revolutionized cancer therapy but can cause serious immune-related adverse events (irAEs), with pneumonitis (ICI-P) being among the most severe. Early identification of high-risk patients before ICI initiation is critical for closer monitoring, timely intervention, and improved outcomes. Purpose: To develop and validate a deep learning foundation model to predict ICI-P from baseline CT scans in patients with lung cancer. Methods: We designed the Checkpoint-Inhibitor Pneumonitis Hazard EstimatoR (CIPHER), a deep learning foundation model that combines contrastive learning with a transformer-based masked autoencoder to predict ICI-P from baseline CT scans in patients with lung cancer. Using self-supervised learning, CIPHER was pre-trained on 590,284 CT slices from 2,500 non-small cell lung cancer (NSCLC) patients to capture heterogeneous lung parenchymal patterns. After pre-training, the model was fine-tuned on an internal NSCLC cohort for ICI-P risk prediction, using images from 254 patients for model development and 93 patients for internal validation. We compared CIPHER with classical radiomic models and further evaluated it on an external NSCLC cohort of 116 patients. Results: In the internal immunotherapy cohort, CIPHER consistently distinguished patients at elevated risk of ICI-P from those without the event, with AUCs ranging from 0.77 to 0.85. In head-to-head benchmarking, CIPHER achieved an AUC of 0.83, outperforming the radiomic models. In the external validation cohort, CIPHER maintained strong performance (AUC = 0.83; balanced accuracy = 81.7%), exceeding the radiomic models (DeLong p = 0.0318) and demonstrating higher specificity without sacrificing sensitivity. By contrast, the radiomic model showed high sensitivity (85.0%) but markedly lower specificity (45.8%). Confusion matrix analysis confirmed the robust classification performance of CIPHER, correctly identifying 80 of 96 non-ICI-P cases and 16 of 20 ICI-P cases. Conclusions: We developed and externally validated CIPHER for predicting future risk of ICI-P from pre-treatment CT scans. With prospective validation, CIPHER may be incorporated into routine patient management to improve outcomes.
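
The confusion-matrix counts in this abstract (80 of 96 non-ICI-P cases and 16 of 20 ICI-P cases correctly identified) can be checked against the reported balanced accuracy of 81.7%. The sketch below derives sensitivity, specificity, and balanced accuracy from those counts, assuming ICI-P is the positive class; the function name `binary_metrics` is illustrative.

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity, and balanced accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true-positive rate
    specificity = tn / (tn + fp)  # true-negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Counts taken from the abstract: 16 of 20 ICI-P cases and 80 of 96
# non-ICI-P cases were correctly identified (ICI-P assumed positive).
metrics = binary_metrics(tp=16, fn=20 - 16, tn=80, fp=96 - 80)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

This gives a sensitivity of 80.0% and a specificity of 83.3%, whose mean is 81.7%, consistent with the balanced accuracy stated for the external cohort of 116 patients.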

10
ClonoScreen3D-CRISPRi Uncovers Genetic Modifiers of Radiation Response in Glioblastoma

Lee, S.; Husmann, A.; Li, J.; Li, C. Z.; Modi, S.; Ahmad, S.; Mackay, S.; Paul, A.; Jackson, M. R.; Chalmers, A. J.; McCarthy, N.; Gomez-Roman, N. J.; Bello, E.

2026-04-21 cancer biology 10.64898/2026.04.17.719014 medRxiv
Top 0.4%
1.9%

Background: Glioblastoma (GBM) is the most aggressive primary brain tumor in adults. Radioresistance, partly mediated by glioma stem-like cells, represents a major clinical challenge which could be overcome by the identification of the modulators of radioresistance. Existing CRISPR screens in human GBM models have largely used two-dimensional cultures with short-term viability readouts, failing to capture the long-term clonogenic behaviour underlying tumour recurrence after radiotherapy. Method: We developed ClonoScreen3D-CRISPRi, combining CRISPRi-mediated gene knockdown with three-dimensional clonogenic survival assays. Two GBM cell lines (G7 and GBML20), differing in MGMT promoter methylation status, were engineered to express the KRAB-dCas9 editor. Nine candidate radiosensitivity modifiers, selected through transcriptomic analysis, pharmacological studies, and literature review, were examined in both lines. Target validation was performed using full radiation dose-response assays and a pharmacological inhibitor. Results: The majority of candidate genes significantly altered survival fraction following irradiation in both cell lines. Knockdown of NFKB2, RELB, and CDK9 produced the most potent radiosensitization, with sensitizer enhancement ratios of 1.39-1.70 in validation studies, exceeding those of established radiosensitizers including PARP and ATM inhibitors. Notably, knockdown of these genes induced no significant cytotoxicity in the absence of radiation. Pharmacological validation using an IKK inhibitor confirmed these findings, implicating non-canonical NF-κB signalling and CDK9-dependent transcriptional elongation as critical adaptive mechanisms in GBM radioresistance. Conclusions: ClonoScreen3D-CRISPRi is a scalable, physiologically relevant platform for identifying genetic modifiers of radioresistance. The non-canonical NF-κB pathway and CDK9 represent promising radiosensitizing targets, and larger screens could enable systematic prioritisation of candidates for clinical translation.

11
Dose-dependent modeling of combinatorial drug responses stratifies patient survival and reveals therapeutic vulnerabilities in precision oncology

Ota, K.; Ito, T.; Shimizu, H.

2026-04-21 cancer biology 10.64898/2026.04.16.718332 medRxiv
Top 0.4%
1.9%

A substantial proportion of cancer patients fail to benefit from their prescribed combination regimens, yet identifying superior alternatives from the vast pharmacological space prior to treatment failure remains an unsolved clinical challenge. Existing computational approaches either rely on multi-omics profiles unavailable in standard oncological practice or reduce drug efficacy to scalar metrics that discard the dose-dependent resolution essential for therapeutic optimization. Here, we present XACT, a hierarchical deep learning framework that reconstructs full dose-dependent drug responses for both monotherapy and drug combinations using only clinically accessible transcriptomic profiles. By leveraging an asymmetric X-Linear Attention mechanism that models second-order interactions between molecular drug substructures and intracellular signaling pathway activities, XACT captures concentration-dependent pharmacodynamics with state-of-the-art accuracy and generalizability to unseen transcriptomic landscapes. When applied to the TCGA pan-cancer cohort, XACT-derived resistance scores were significantly associated with clinical treatment outcomes and stratified overall survival as the strongest independent prognostic factor after multivariate adjustment for tumor stage and cancer type. Systematic virtual screening revealed therapeutic vulnerabilities and nominated alternative regimens for treatment-refractory sarcoma and pancreatic adenocarcinoma. These results establish XACT as a scalable, interpretable, and clinically translatable framework that advances precision oncology from computational prediction toward data-driven therapeutic prescription.

12
Integrated Single-Cell and Spatial Profiling of MMP Gene Expression in Colorectal Cancer

Danese, N. A.; Kurkcu, S. R.; Bleiler, M.; Nito, K.; Kuo, A.; Rosenberg, D. W.; Nakanishi, M.; Giardina, C.

2026-04-21 cancer biology 10.64898/2026.04.17.719089 medRxiv
Top 0.6%
1.7%

Increased matrix metalloproteinase (MMP) expression has long been recognized as a common feature of colorectal cancers (CRCs), yet less is known about how these enzymes interact to impact cancer progression. Taking advantage of single-cell and spatial transcriptomic data, we analyzed the cell-type-specific and spatial expression of MMPs in CRCs. Distinct colon cancer-associated fibroblast (CAF) subtypes were found to express different MMP combinations, including MMP1/3-expressing and MMP11-expressing CAFs. Conversely, myeloid cells (monocytes, macrophages, and dendritic cells) expressed varying levels of the "myeloid MMPs" 9, 12, and 14, which correlated closely with secretory gene expression. Finally, a small population of cancer cells expressed high levels of MMP7. The MMP7-expressing cancer cells frequently co-expressed MMP1, MMP14, and several Wnt-related genes, consistent with a cancer cell type at high risk of malignancy and metastasis. Spatial transcriptomic data showed MMP expression in discernible clusters driven in part by cell-type localization, including fibroblast-heavy stromal regions and inflammatory cell hubs. Epithelial-rich areas showed subregions of MMP7-expressing cancer cells, including areas where cancer cell and myeloid MMP expression overlap. Tumors showed a wide variation in MMP1-expressing CAFs, a variation reflected in primary CAF cell lines. In vitro, MMP1 expression was a stable phenotype that persisted through multiple rounds of division. MMP1-expressing CAFs were frequently positioned at the stromal interface, suggesting a role in facilitating cell movement across the tumor boundary. Our analysis indicates that cell-type and positional MMP expression varies between tumors and may play a role in determining lesion progression and cancer spread.

13
Nanopore Whole-Genome Sequencing for Rapid, Comprehensive Molecular Diagnostics of Brain Tumors in Adult Patients

Halldorsson, S.; Nagymihaly, R. M.; Bope, C. D.; Lund-Iversen, M.; Niehusmann, P.; Lien-Dahl, T.; Pahnke, J.; Bruning, T.; Kongelf, G.; Patel, A.; Sahm, F.; Euskirchen, P.; Leske, H.; Vik-Mo, E. O.

2026-04-24 pathology 10.64898/2026.04.23.26351563 medRxiv
Top 0.7%
1.5%

Background: Classification of central nervous system (CNS) tumors has become increasingly complex, raising concerns about the sustainability of comprehensive molecular diagnostics. We have evaluated nanopore whole genome sequencing (nWGS) as a single workflow to replace multiple diagnostic assays. Methods: We performed nWGS on DNA extracted from 90 adult CNS tumor samples (58 retrospective, 32 prospective) and compared the results to findings from standard of care (SoC) diagnostic work-up. Analysis was done through an automated workflow that consolidated diagnostically and therapeutically relevant genomic alterations, including copy-number variation, structural and single-nucleotide variants, chromosomal aberrations, gene fusions, and methylation-based classification. Results: nWGS supported final diagnostic classification in all samples with >15% tumor cell content, requiring ~3 hours of hands-on library preparation, parallel sample processing, and sequencing times within 72 hours. Methylation-based classification was available within 1 hour and was concordant with the integrated final diagnosis in 89% of cases (80/90). All diagnostically relevant copy-number variations, single-nucleotide variants, and gene fusions were concordant with SoC testing. MGMT promoter methylation status matched in 94% of cases. In addition, nWGS identified prognostic and potentially actionable variants that were not reported or covered by SoC. Conclusions: nWGS delivers comprehensive genetic and epigenetic results with a fast turn-around compared to standard methods. This enables efficient, accurate, and scalable molecular diagnostics of CNS tumors using a single platform. These data support its implementation in routine clinical practice, and the approach may be extended to other cancer types requiring complex genomic profiling.

14
Onca: An Open 9B Language Model for Pancreatic Cancer Clinical Tasks

Shim, K. B.

2026-04-24 oncology 10.64898/2026.04.16.26351055 medRxiv
Top 0.8%
1.3%

Pancreatic ductal adenocarcinoma (PDAC) remains one of the deadliest solid tumors and continues to face low treatment-trial participation, fragmented evidence workflows, and labor-intensive abstraction of unstructured clinical text. Existing oncology-focused language models show promise, but many depend on private institutional corpora, limiting reproducibility and practical reuse across centers. We present Onca, an open 9B dense model designed for four PDAC-relevant tasks: trial eligibility screening, case-specific clinical reasoning, structured pathology report extraction, and molecular variant evidence reasoning. Onca is fine-tuned from Qwopus3.5-9B-v3 with a single Unsloth BF16 LoRA adapter on 37,364 training rows drawn from openly available sources. The evaluation spans 11 panels and compares Onca against Woollie-7B, CancerLLM-7B, OpenBioLLM-8B, and the unmodified Qwopus base. Onca achieves the strongest overall results on Trial Screening (81.6 F1), Clinical Reasoning (14.1 composite), Pathology Extraction (30.5 field exact-match), PubMedQA Cancer (68.3 macro-F1), and PubMedQA (66.5 macro-F1). The strongest gains appear in tasks closest to routine oncology workflow, especially trial review and pathology structuring. These findings suggest that clinically targeted pancreatic-cancer language models can be built from open data with competitive performance while remaining practical to train on a single workstation-scale GPU setup.

15
Practical Management of Adverse Events Associated with Bispecific Antibodies for the Treatment of Multiple Myeloma: A Qualitative Interview Study

Graham, T. R.; White, M. G.; Blue, B.; Hartley-Brown, M.; Hunter, B. D.; Huynh, C.; Joseph, N.; Keruakous, A.; Pan, D.; Rudolph, P.; Sawhney, R.; Suvannasankha, A.

2026-04-27 oncology 10.64898/2026.04.24.26350878 medRxiv
Top 0.8%
1.3%

PURPOSE: Bispecific antibodies (BsAbs) represent a major advancement in the management of relapsed/refractory multiple myeloma (RRMM), offering high response rates even in heavily pretreated patients. However, their use presents operational, safety, and supportive care complexities that require coordinated care teams and evolving infrastructure. This manuscript summarizes best practice recommendations for adverse event (AE) management, outpatient operational models, referral pathways, and emerging strategies to optimize long-term tolerability. METHODS: Medlive, A PlatformQ Health Brand, conducted qualitative interviews of academic and community-based clinicians. Discussions focused on BsAb implementation, patient selection and counseling, and AE management. Experts provided recommendations on team-based protocols, transitions of care, and inpatient versus outpatient considerations. RESULTS: Ten hematologists/oncologists (academic n=4; community n=6) described practice patterns, barriers, and perspectives on BsAb use. BsAbs were consistently regarded as highly effective across multiple lines of therapy, particularly for patients without alternatives. Cytokine release syndrome (CRS) was the most common acute toxicity, generally low grade and managed effectively with early tocilizumab, including prophylactic use in outpatient settings. Immune effector cell-associated neurotoxicity syndrome (ICANS) was rare, mild, and best mitigated through early recognition and caregiver support. Infections, largely from BCMA-associated hypogammaglobulinemia, frequently interrupted therapy, necessitating antiviral prophylaxis, Pneumocystis jirovecii pneumonia (PJP) prophylaxis, and intravenous immunoglobulin (IVIG). Outpatient step-up dosing is expanding, supported by prophylactic strategies and academic-community collaboration. Timely referral was emphasized as essential to preserving eligibility. Major outpatient challenges included sequencing, infrastructure readiness, and standardized caregiver and staff education. CONCLUSION: Effective community implementation of BsAbs requires multidisciplinary coordination, standardized AE protocols, infection prevention, and infrastructure to support monitoring, referrals, and equitable access. These measures are critical to ensure safe, sustainable integration of bispecific therapies and to optimize patient outcomes.

16
Generalizable Deep Learning Framework for Radiotherapy Dose Prediction Across Cancer Sites, Prescriptions and Treatment Modalities

Chang, H.-h.; Cardan, R.; Nedunoori, R.; Fiveash, J.; Popple, R.; Bodduluri, S.; Stanley, D. N.; Harms, J.; Cardenas, C.

2026-04-22 radiology and imaging 10.64898/2026.04.17.26350770 medRxiv
Top 0.8%
1.2%

Optimizing radiotherapy dose distributions remains a resource-intensive bottleneck. Existing AI-based dose prediction methods often have limited generalizability because they rely on small, heterogeneous datasets. We present nnDoseNetv2, an auto-configured, end-to-end framework for dose prediction across diverse disease sites (head and neck, prostate, breast, and lung), prescription levels (1.5-84 Gy), and treatment modalities (IMRT, VMAT, and 3D-CRT). By integrating machine-specific beam geometry with 3D structural information, the framework is designed to generalize across varied clinical scenarios. A single multi-site model was trained on 1,000 clinical plans. On sites seen during training, performance was comparable to specialized site-specific models. On unseen sites (liver and whole brain), the model outperformed site-specific models, with mean absolute errors of 2.46% and 6.97% of prescription, respectively. These results suggest that geometric awareness can bridge disparate anatomical domains while eliminating the need for site-specific model maintenance, providing a scalable and high-fidelity approach for personalized radiotherapy planning.

17
Development of a Fully Non-Viral 1XX-enhanced BCMA CAR-T Cell Therapy for Multiple Myeloma

Talbot, A.; Li, K.; Lee, J. H. J.; Lang, S.; Liu, C.; Kalter, N.; Li, Z.; Mortazavi, Y.; Almudhfar, N.; Muldoon, J. J.; Allain, V.; Nyberg, W.; Chung, J.-Y. J.; Wang, C.; Qi, Z.; Krishnappa, N.; Ha, A. S.; Kong, D.; Houser, D.; Paruthiyil, S.; Ahmadi, M.; Ji, Y.; Rosenberg, M.; Acevedo, L. A.; Liang, B.; Briseno, K.; Kwek, S. S.; Giannikopoulos, P.; Riviere, I.; Sadelain, M.; Oh, D. Y.; Marson, A.; Hendel, A.; Martin, T.; Eyquem, J.; Shy, B. R.

2026-04-22 cancer biology 10.64898/2026.04.20.719660 medRxiv
Top 0.9%
1.2%

Multiple myeloma (MM) is a clonal plasma cell malignancy characterized by bone marrow infiltration, monoclonal immunoglobulin production, and microenvironmental dysregulation that leads to systemic organ damage. The advent of B-cell maturation antigen (BCMA)-directed chimeric antigen receptor (CAR) T-cell therapy has induced unprecedented responses and durability for patients with relapsed/refractory MM. These outcomes are rarely observed with prior salvage strategies, although relapse remains the predominant long-term challenge for most patients. The two currently approved BCMA CAR-T cell products use viral vectors to semi-randomly insert the CAR gene, which results in heterogeneous genomic composition and variability in efficacy, safety, and product consistency. To address these challenges, we integrated targeted CRISPR genome engineering with precise CAR transgene insertion at the T-cell receptor alpha constant (TRAC) locus, 1XX CAR signaling architecture to enhance potency and durability, and non-viral manufacturing with a single-stranded DNA repair template to improve efficiency and yield. This approach confers physiological CAR expression, reduces insertional mutagenesis, and improves persistence by mitigating tonic signaling and exhaustion. Our GMP manufacturing process consistently achieved high CAR integration (37.7-72.7%) and yields across all full-scale runs and met predefined release criteria for identity, purity, safety, and quality. In NSG mouse models of MM, the UCCT-BCMA-1 product exhibited exceptionally potent tumor control, CAR-T cell expansion 100-1000-fold greater than that of lentiviral constructs, and durable clearance of myeloma cells after multiple rechallenges. These findings establish a CRISPR-edited, fully non-viral manufacturing platform for next-generation 1XX-BCMA CAR-T therapies with enhanced persistence, safety, and efficacy. 
One Sentence Summary: CRISPR-engineered, TRAC-targeted 1XX-BCMA CAR-T therapy with improved safety, potency, and persistence in relapsed and refractory multiple myeloma.

18
A long-read RNA sequencing and polysome profiling framework reveals transposable element-driven transcript diversity and translational rewiring in glioblastoma

Pizzagalli, M.; Sasipalli, S.; Leary, O.; Tran, L.; Haas, B.; Tapinos, N.

2026-04-21 cancer biology 10.64898/2026.04.18.719388 medRxiv
Top 0.9%
1.2%

Background: Transposable elements (TEs) account for over half of the human genome and are often derepressed in cancer. TEs can add cryptic splice sites, undergo exonization, and generate gene-TE fusion transcripts, but the combined effects of TEs on RNA processing and translation in glioblastoma stem cells (GSCs) remain incompletely elucidated. Results: We combined long-read RNA sequencing with polysome profiling in four patient-derived GSCs and two neural stem cell (NSC) controls to resolve TE-associated transcript diversity and its relationship to ribosomal engagement. Across GSCs, we identified 13,421 alternative splicing (AS) events, 3,077 of which contained TEs within 150 bp of splice junctions. AS sites proximal to TEs were associated with increased isoform switching compared to non-TE-associated AS sites (odds ratio 2.9-4.3). Moreover, AS isoforms generated from TE-proximal sites were more likely to exhibit altered ribosomal association (odds ratio 2.54). Directional shifts were observed, with shorter isoforms associating with monosome fractions and longer isoforms with polysome fractions. To enable systematic detection of gene-TE chimeric transcripts, we developed FuTER (Fusion TE Reporter), a long-read-based framework for identifying TE-associated fusions. Application to GSC datasets identified 78 GSC-enriched fusion transcripts, several supported by breakpoint-spanning reads in polysome fractions, consistent with ribosome association. Conclusions: Our data suggest that TEs correlate with abnormal splicing activity and altered ribosome engagement in glioblastoma stem cells. By integrating long-read sequencing with polysome profiling and fusion detection, we establish a framework for analysis of TE-induced transcript diversity and its effects on cancer evolution and plasticity.

19
Beyond Histology: A Validated CUBIC-Based Workflow for Volumetric Analysis of Follicles and Cortical Vasculature in Human Ovarian Tissue

Pavlidis, D. I.; Fischer, C. E.; Jennings, M. A.; Machlin, J. H.; Jan, V.; Baker, B. M.; Shikanov, A.

2026-04-21 bioengineering 10.64898/2026.04.16.718954 medRxiv
Top 0.9%
1.1%

Research question: Can tissue clearing, combined with volumetric imaging, enable reliable, quantitative three-dimensional analysis of follicles and vasculature in intact human ovarian tissue? Design: A CUBIC-based clearing protocol was adapted for human ovarian medulla and cryopreserved cortex. Tissue from reproductive-aged donors was cleared, fluorescently labeled, and imaged using confocal and light sheet microscopy. Tissue expansion, imaging depth, and vascular morphometrics were quantified, and follicle density was compared to conventional histology. Results: Clearing produced optically transparent tissue with a linear expansion factor of 1.2 across cortex and medulla. Imaging depth increased 6.5-11-fold in cortex and 6-8-fold in medulla. Follicle density measurements in immunolabeled cleared cortex were comparable to histology, supporting the validity of volumetric follicle quantification. Light sheet microscopy of lectin-labeled cortex revealed no significant donor-to-donor differences in vascular morphometrics, including mean vessel diameters of 12-14 µm, branch point densities of 632-965 points/mm³, vessel length densities of 117-175 mm/mm³, and volume fractions of 1.9-2.3%. Volumetric imaging further illustrated heterogeneous spatial relationships between follicles and surrounding vessels. Conclusion: Tissue clearing and volumetric imaging complement routine histology and enable quantitative three-dimensional investigation of follicle-vascular interactions in intact human ovarian tissue, providing a framework for advancing fertility preservation and ovarian tissue transplantation research.

20
Practical quantification of immunohistochemistry antigen concentrations and reaction-diffusion parameters

Peale, F. V.; Perng, W.; Mbiribindi, B.; Andrews, B. T.; Wang, X.; Dunlap, D.; Eastham, J.; Ngu, H.; Chernyshev, A.; Orlova, D.

2026-04-21 pathology 10.64898/2026.04.16.719078 medRxiv
Top 1.0%
1.0%

The immunohistochemistry (IHC) methods widely used in diagnostic medicine and biomedical research are kinetically complex reaction-diffusion processes that, ideally, produce stain intensities correlated with the local antigen concentration. Yet after 75 years of use, practical theoretical tools to rigorously plan and interpret IHC experiments are still lacking. Because modeling the reactions requires time-consuming computer simulation, impractical for regular use, most protocols are optimized empirically, without detailed knowledge of the reaction rates and antigen-antibody equilibria. The resulting stain intensities can be calibrated against standards with known antigen abundance, but they are typically not interpretable in terms of chemical antigen concentrations. To address these limitations, we developed a fast interpolation method to model reaction-diffusion behavior, and experimental methods to characterize IHC kinetic parameters in formalin-fixed paraffin-embedded (FFPE) samples. Used together, these allow experimental measurement of both the chemical concentration of antigen in the sample and the reaction-diffusion parameters consistent with the assay results. Results show 1) direct immunofluorescent detection has low nanomolar sensitivity with >1000-fold dynamic range, and 2) antibody diffusion rates in FFPE samples can be >1000-fold slower than in aqueous solutions, producing diffusion-limited conditions in which the IHC reaction time course may depend on the sample antigen concentration. Awareness of these details is necessary to avoid potential underestimation of both the absolute and relative antigen concentrations in different samples that may occur if staining is stopped before reaching equilibrium. Software tools are provided to allow users to rapidly model IHC reaction time courses and to fit experimental time course data with candidate reaction parameters. 
The principles described here apply equally to other tissue-based "spatial omics" analyses and should be considered when designing and interpreting experiments requiring any macromolecule to diffuse into and react in a tissue section. SIGNIFICANCE: The theoretical and experimental framework described here advances IHC staining from a qualitative or semi-quantitative method towards a more rigorously quantitative assay. The practical ability to predict IHC reaction kinetics and fit reaction parameters to experimental data has the potential to advance IHC applications in diagnostic medicine and biomedical research in three ways: 1) interpretation of experimental and diagnostic samples stained under different conditions can be more objective, facilitating comparison of results from different protocols and different laboratories; 2) IHC staining can be interpreted as molar chemical antigen-antibody concentrations calculated from the reaction parameters measured in the studied sample; 3) the correlation between antigen concentration and biological behavior can be examined more reliably. Practical software tools are provided.